A transposon-based strategy for sequencing repetitive DNA in eukaryotic genomes.
Abstract
Repetitive DNA is a significant component of eukaryotic genomes. We have developed a strategy to efficiently and accurately sequence repetitive DNA in the nematode Caenorhabditis elegans using integrated artificial transposons and automated fluorescent sequencing. Mapping and assembly tools represent important components of this strategy and facilitate sequence assembly in complex regions. We have applied the strategy to several cosmid assembly gaps resulting from repetitive DNA and have accurately recovered the sequences of these regions. Analysis of these regions revealed six novel transposon-like repetitive elements, IR-1, IR-2, IR-3, IR-4, IR-5, and TR-1. Each of these elements represents a middle-repetitive DNA family in C. elegans containing at least 3-140 copies per genome. Copies of IR-1, IR-2, IR-4, and IR-5 are located on all (or most) of the six nematode chromosomes, whereas IR-3 is predominantly located on chromosome X. These elements are almost exclusively interspersed between predicted genes or within the predicted introns of these genes, with the exception of a single IR-5 element, which is located within a predicted exon. IR-1, IR-2, and IR-3 are flanked by short sequence duplications resembling the target site duplications of transposons. We have established a website database (http://www.welch.jhu.edu/~devine/RepDNAdb.html) to track and cross-reference these transposon-like repetitive elements; it contains detailed information on individual element copies and provides links to appropriate GenBank records. This set of tools may be used to sequence, track, and study repetitive DNA in model organisms and humans.
Related resources
Novel sequencing strategy for repetitive DNA in a Drosophila BAC clone reveals that the centromeric region of the Y chromosome evolved from a telomere†
The centromeric and telomeric heterochromatin of eukaryotic chromosomes is mainly composed of middle-repetitive elements, such as transposable elements and tandemly repeated DNA sequences. Because of this repetitive nature, Whole Genome Shotgun Projects have failed in sequencing these regions. We describe a novel kind of transposon-based approach for sequencing highly repetitive DNA sequences i...
Sequencing of long stretches of repetitive DNA
Repetitive DNA is widespread in eukaryotic genomes, in some cases making up more than 80% of the total. SSRs are a type of repetitive DNA formed by short motifs repeated in tandem arrays. In some species, SSRs may be organized into long stretches, usually associated with the constitutive heterochromatin. Variation in repeats can alter the expression of genes, and changes in the number of repeat...
Efficient capture of unique sequences from eukaryotic genomes.
Cot-based cloning and sequencing (CBCS), a synthesis of Cot analysis, DNA cloning and high-throughput sequencing, promises to accelerate the study of eukaryotic genomes. In particular, CBCS will (1) permit efficient gene discovery in species with substantial quantities of repetitive DNA, (2) allow the sequence complexity (i.e. all the unique sequence information) of large genomes to be elucidat...
One-way sequencing of multiple amplicons from tandem repetitive mitochondrial DNA control region.
Repetitive DNA sequences not only exist abundantly in eukaryotic nuclear genomes, but also occur as tandem repeats in many animal mitochondrial DNA (mtDNA) control regions. Due to concerted evolution, these repetitive sequences are highly similar or even identical within a genome. When long repetitive regions are the targets of amplification for the purpose of sequencing, multiple amplicons may...
Repetitive Sequences in Plant Nuclear DNA: Types, Distribution, Evolution and Function
Repetitive DNA sequences are a major component of eukaryotic genomes and may account for up to 90% of the genome size. They can be divided into minisatellite, microsatellite and satellite sequences. Satellite DNA sequences are considered to be a fast-evolving component of eukaryotic genomes, comprising tandemly-arrayed, highly-repetitive and highly-conserved monomer sequences. The monomer unit ...
Journal: Genome Research
Volume 7, Issue 5
Publication date: 1997